Abstract

Criticism of Gnutella network scalability has rested on the bandwidth 
attributes of the original interconnection topology: a Cayley tree. Trees, 
in general, are known to have lower aggregate bandwidth than higher 
dimensional topologies e.g., hypercubes, meshes and tori. Gnutella was 
intended to support thousands to millions of peers. Studies of intercon- 
nection topologies in the literature, however, have focused on hardware 
implementations which are limited by cost to a few thousand nodes. Since 
the Gnutella network is virtual, hyper-topologies are relatively unfettered 
by such constraints. We present performance models for several plausible 
hyper-topologies and compare their query throughput up to millions of 
peers. The virtual hypercube and the virtual hypertorus are shown to 
offer near linear scalability subject to the number of peer TCP/IP con- 
nections that can be simultaneously kept open. 



1 Introduction 



The Gnutella network belongs to a class of open source [Gnutella 2002] virtual
networks known as Peer-to-Peer or P2P networks. In contrast to the more
ubiquitous client-server distributed architectures, every P2P node (or servant)
can act as both a client and a server. Many client-server applications e.g.,
commercial databases, have multiple clients (users) accessing a centralized
server (see e.g., [Gunther 2000] Chap. 8). Conversely, P2P network applications
are usually completely decentralized.

Finding applications that can make efficient use of P2P is the current gating
factor for their widespread adoption. So far, P2P networks have been employed
for such applications as the Napster (www.napster.com) music file-sharing
service, and the SETI@Home project (setiathome.ssl.berkeley.edu), although
those implementations rely on a significant centralized server component.

The initial release of Gnutella in 2000 led to the perception that the
intrinsic architecture may not be capable of scaling to meet the sharing
demands of millions of anticipated users. Similar concerns about scalability
have arisen in the context of hypergrowth traffic impinging on popular
e-commerce Web sites [Gunther 2001]. Based on measurements of popular queries,
[Sripan 2001] proposed that Gnutella scaling problems could be ameliorated
through the implementation of appropriate caching strategies. Measurements by
[AdaHub 2000] indicated that there were more readers than writers involved in
file sharing. They suggested that such a "free ride" could lead to higher than
expected load on the P2P network thereby degrading its performance as well as
increasing its vulnerability to fragmentation.

A mathematical analysis by [Ritter 2001] (one of the original developers
of Napster) presented a detailed numerical argument demonstrating that the
Gnutella network could not scale to the capacity of the competitor Napster
network. Essentially, that model showed that the Gnutella network is severely
bandwidth limited long before the P2P population reaches a million peers. In
each of these previous studies, the conclusions have overlooked the intrinsic
bandwidth limits of the underlying topology [Minar 2002] in the Gnutella
network: a Cayley tree [RaiSlo 1999]. (See Section 2 for the definition.)

Trees are known to have lower aggregate bandwidth than higher dimensional
topologies e.g., hypercubes and hypertori. Studies of interconnection
topologies in the literature have tended to focus on hardware implementations
(see e.g., [Cull et al. 1996], [Buyya 1999], [AlmGot 1994] and [PatHen 1996])
which are generally limited by the cost of the chips and wires to a few
thousand nodes [Gunther 2002]. P2P networks, on the other hand, are intended to
support hundreds of thousands to millions of simultaneous peers and since they
are implemented in software, hyper-topologies are relatively unfettered by the
economics of hardware.

In this paper, we analyze the scalability of several alternative topologies 
and compare their throughput up to 2-3 million peers. The virtual hypercube 
and the virtual hypertorus offer near-linear scalable bandwidth subject to the 
number of peer TCP/IP connections that can be simultaneously kept open. 
We adopt the abbreviation hypernet for these alternative topologies. The
assumptions about the distribution of peer activity are similar to those
employed by [Ritter 2001]. This is appropriate since our purpose is to rank the
relative performance of these hypernets rather than to predict their absolute
performance.

1 In 2001, the size of the Napster network was 160,000 simultaneous users, down
from a peak of 1.6 million reported by Webnoize in February, 2001.

2 At the height of the media attention, Napster's legal problems drove some
50,000 users per day over to Gnutella such that peers connected by 56 Kbps
phone lines caused the P2P network to fragment into disconnected "islands" of
about 200 peers.

3 As the SETI@Home project has demonstrated, 2.8 million desktops (and 10
PetaFLOPS) can be harnessed for free.



2 Tree Topologies 

In the subsequent discussion, the P2P network is treated as a graph i.e., a set
of nodes or vertices connected by a set of edges or links. The nodes correspond
to network peers and the links to network connections.

Because the tree structure of the Gnutella network has been such a hidden
determinant underlying the conclusions drawn in previous scalability studies,
we commence our performance comparisons by distinguishing clearly among
the relevant tree topologies. Topologically, all trees are planar and thus have
d = 2 spatial dimensionality.



2.1 Binary Tree 

The binary tree is familiar in the computing context by virtue of its ubiquity
as a parsing and storage data structure [Wirth 1976]. There is a unique root
node which is connected only to two sibling nodes and each of those siblings
is connected to another pair of sibling nodes and so on. At each level (h) in
the tree, there are $2^h$ nodes. Therefore, the number of nodes grows as a
binary exponential. Because of its relatively sparse nodal density, the binary
tree is rarely employed as a bona fide interconnection network.



2.2 Rooted Tree 

A rooted tree is simply the generalization of a binary tree in which each node 
(other than the root) has a vertex of degree v. The total number of nodes is 
the sum of a geometric series: 

$$N_{root}(h) = \frac{v^{h} - 1}{v - 1} \qquad (1)$$



2.3 Cayley Tree 



A Cayley tree [RaiSlo 1999] has no root. Recalling the binary tree, what was
the root of the parent binary tree now has a link to another binary sub-tree
of height one less than the parent. All nodes thus become tri-valent with v = 3
at every level. More generally, for a v-valent tree, the total number of nodes
is given by:

$$N_{cay}(h) = 1 + \sum_{k=1}^{h-1} v\,(v-1)^{k-1} \qquad (2)$$

and therefore is denser than (1).

This is the central formula used in the scalability analysis of [Ritter 2001].
The network he analyzed is thus a Cayley tree with vertex degree (v)
corresponding to the number of open network connections per servant.
[Ritter 2001] analyzed valences in the range v = 4 ... 8; the former value
being the default setting in the original Gnutella release, and the latter
yielding a population that more closely resembles the number of peers claimed
for the contemporaneous Napster network.
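The node counts for the tree topologies of this section can be sketched in a few lines of Python. The function names are illustrative, not taken from the paper. The sketch assumes that the horizon h counts the central peer as the first level, so the Cayley shells run over k = 1 ... h - 1. That convention is chosen here because it reproduces the populations quoted in Section 5.4 (144,801 peers for a 20-valent tree at 5 hops, about 7.7 million for an 8-valent tree at 9 hops). The rooted tree is assumed to have v children per node.

```python
def rooted_tree_nodes(v: int, h: int) -> int:
    """Nodes in a rooted tree with v children per node and h levels.

    Assumption: levels are numbered 0 .. h-1, giving a geometric series.
    """
    return sum(v ** k for k in range(h))


def cayley_tree_nodes(v: int, h: int) -> int:
    """Nodes in a v-valent Cayley tree within a horizon of h.

    Assumption: the central peer is level one, so shell k (k = 1 .. h-1)
    holds v * (v - 1)**(k - 1) peers.
    """
    return 1 + sum(v * (v - 1) ** (k - 1) for k in range(1, h))


# Populations for the valences v = 4 ... 8 studied by Ritter.
for v in range(4, 9):
    print(v, [cayley_tree_nodes(v, h) for h in range(2, 10)])
```

The printed rows show how coarse the available population sizes become as the valence grows: each extra hop multiplies the outermost shell by v - 1.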



3 Hypernet Topologies 

An alternative to bandwidth-limited trees is a topology with higher dimension- 
ality. We examine the performance attributes of two hypernets in particular: 
the binary hypercube and the hypertorus, each in d-dimensions. 



3.1 Hypercube 

In a boolean or binary hypercube each node forms the vertex of a d-dimensional
cube [HPCC]. The number of nodes is simply $2^d$ and the degree of each vertex
(v) is equal to the dimensionality (d) of the network. Hence, each node can be
enumerated or addressed using a base-2 (binary) d-digit number.

Moreover, since neighboring nodes differ in address by only 1 digit, sending 
a message on the hypercube becomes a simple matter of shifting successive bits 
as the binary address passes each node between source and destination. 

In d = 3 dimensions the hypercube is simply a cube. Each vertex has degree
v = 3, and there are $2^3 = 8$ nodes. A 4-dimensional hypercube can be
visualized as spatially translating a 3-cube such that the loci of its 8
vertices trace out the additional connections.
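The addressing scheme described above lends itself to a short illustration. The following Python sketch is not from the paper; the function names and the choice to correct address bits from lowest to highest are assumptions made for the example. It lists the neighbors of a peer and builds one shortest route by flipping each address bit in which source and destination differ.

```python
def hypercube_neighbors(node: int, d: int) -> list[int]:
    """Peers adjacent to `node` in a d-dimensional binary hypercube.

    Each neighbor differs from `node` in exactly one address bit.
    """
    return [node ^ (1 << bit) for bit in range(d)]


def hypercube_route(src: int, dst: int, d: int) -> list[int]:
    """One shortest path from src to dst, fixing differing bits lowest first."""
    path = [src]
    current = src
    for bit in range(d):
        mask = 1 << bit
        if (current ^ dst) & mask:   # addresses disagree in this dimension
            current ^= mask          # hop across that dimension
            path.append(current)
    return path


print(hypercube_neighbors(0b000, 3))     # [1, 2, 4]
print(hypercube_route(0b000, 0b101, 3))  # [0, 1, 5]
```

The number of hops equals the number of differing bits, so no route is longer than d.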



3.2 Hypertorus



A d-dimensional hypertorus [HPCC] is a d-dimensional grid with each node
connected to a ring of nodes in each of the d orthogonal dimensions. The
hypertorus reduces to the binary hypercube when there are only 2 nodes in
each ring.
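A minimal Python sketch of this connectivity follows. It assumes, for illustration only, that every ring has the same number of nodes k and that peers are addressed by d-tuples of ring positions; neither the names nor the uniform ring size come from the paper.

```python
from itertools import product


def torus_neighbors(coord: tuple[int, ...], k: int) -> set[tuple[int, ...]]:
    """Peers adjacent to `coord` in a d-dimensional torus with k nodes per ring."""
    neighbors = set()
    for axis in range(len(coord)):
        for step in (-1, 1):
            moved = list(coord)
            moved[axis] = (moved[axis] + step) % k   # wrap around the ring
            neighbors.add(tuple(moved))
    neighbors.discard(coord)   # only matters for the degenerate k = 1 ring
    return neighbors


def torus_nodes(d: int, k: int) -> list[tuple[int, ...]]:
    """All k**d peer addresses of the torus."""
    return list(product(range(k), repeat=d))


print(len(torus_neighbors((0, 0, 0), 4)))   # 6 neighbors: two per dimension
print(sorted(torus_neighbors((0, 0, 0), 2)))
```

With k = 2 the two directions around each ring lead to the same peer, so every node is left with d neighbors, which is the hypercube case mentioned above.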

The simplest visualization is, once again, in 3-dimensions. A 2-dimensional
grid is first wrapped about one axis such that the edges join to form a tube.
The tube is wrapped about the orthogonal axis to form a ring such that the open
ends of the tube become joined. The result is a 3-torus, otherwise known as a
donut.

All of these topologies fall into a class known as single stage networks and are
relatively easy to implement in software. The more exotic topologies, such as
cube-connected cycles, butterflies and other multistage [AlmGot 1994] networks
are not considered here because they are likely to be more difficult to
implement.



4 Performance Metrics 
4.1 Network Diameter (δ)

The notion of a network diameter is analogous to the diameter of a circle.
There, it is the maximum chordal length between two points on the
circumference. For a network, it is the maximum number of communication links
that must be traversed to send a message to any node along the shortest path.
It represents a lower bound on the latency to propagate messages throughout the
entire network. In 1997 the Web was estimated to comprise more than half a
million sites [Gray 1997]. By 2001, it was estimated [OCLC 1991] to have grown
to 3.1 million publicly accessible sites.

Table 1: Network diameters.

The diameter of the Web has been estimated [Reka et al. 1999] to be about
20 hops. If the Web is modelled as a Cayley tree, its height would be half the
diameter i.e., h = δ/2 = 10 hops. A tree with a vertex degree of 5 (connections
per node) would contain just under half a million nodes while a vertex degree
of 6 would contain nearly 3 million (2,929,687) nodes.
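The two populations quoted above can be checked by summing the shells of a v-valent Cayley tree in closed form. The worked evaluation below assumes that the height h counts the central node as the first level, so that the sum over shells stops at h - 1; this is the reading under which the quoted figure of 2,929,687 is recovered exactly.

```latex
\begin{aligned}
N_{cay}(h) &= 1 + \sum_{k=1}^{h-1} v\,(v-1)^{k-1}
            = 1 + v\,\frac{(v-1)^{h-1} - 1}{v - 2}, \\
N_{cay}(10)\big|_{v=5} &= 1 + 5\cdot\frac{4^{9} - 1}{3} = 436{,}906, \\
N_{cay}(10)\big|_{v=6} &= 1 + 6\cdot\frac{5^{9} - 1}{4} = 2{,}929{,}687 .
\end{aligned}
```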

4.2 Total Nodes (N) 

The total number of peer nodes in the P2P network. For a binary tree:

$$N(h) = \sum_{k=1}^{h} 2^{k-1} \qquad (3)$$

For a d-dimensional binary hypercube the number of nodes is $2^d$.

4.3 Path Length 

The path length is the maximal distance between a leaf node and the root. For
a tree, it is half the diameter. The path length corresponds to the peer
horizon used by [Ritter 2001] in his analysis. A better measure of network
latency is the average number of hops (H), which we shall define shortly.

4.4 Internal Path Length (P) 

The internal path length is the total number of paths between all nodes. For a 
binary tree of depth h, the total number of paths is: 



$$P(h) = \sum_{k=1}^{h} k\, N_k \qquad (4)$$

4.5 Average Number of Hops (H)

Since the network diameter is a maximal distance, it tends to overestimate
message latency. A better measure is the average number of hops between
source and destination. This quantity is found by dividing the internal path
length in (4) by the total number of nodes in (3):

$$H = \frac{P}{N} \qquad (5)$$

It corresponds to the average number of network hops traversed by a P2P query.
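For small networks, both the diameter and the average hop count can be measured directly by breadth-first search. The Python sketch below is an illustration under stated assumptions, not the paper's procedure: it takes P to be the sum of hop counts from one reference peer to every peer, and it relies on the topology being vertex-symmetric (as hypercubes and tori are) so that a single source is representative. All names are hypothetical.

```python
from collections import deque


def hop_counts(neighbors, source):
    """Breadth-first search: shortest hop count from `source` to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors(node):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def diameter_and_mean_hops(neighbors, source):
    """Return (largest hop count, P / N) as seen from `source`.

    Assumption: P is the sum of hop counts from `source`; N is the number
    of reachable nodes, including the source itself.
    """
    dist = hop_counts(neighbors, source)
    total_path_length = sum(dist.values())
    return max(dist.values()), total_path_length / len(dist)


def cube_neighbors(d):
    """Neighbor function for a d-dimensional binary hypercube."""
    return lambda node: [node ^ (1 << bit) for bit in range(d)]


print(diameter_and_mean_hops(cube_neighbors(4), 0))   # (4, 2.0)
```

For the 4-dimensional hypercube the search gives a diameter of 4 and an average of 2 hops, that is d and d/2 respectively.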
4.6 Number of Network Links (L) 

This is a measure of the number of physical network links. As shown in Table 2,
L scales with the number of physical nodes (N) for the topologies we consider.

Table 2: Network links.

4.7 Network Demand ($D_{link}$)

The transit frequency across a link $f_{link}$ is a measure of the average
query size per link. Under the assumption of uniform message routing, it can be
defined as:

$$f_{link} = \frac{H}{L} \qquad (6)$$

If the latency across a link is denoted by $S_{link}$, then the total service
demand [Gunther 2000] is:

$$D_{link} = f_{link}\, S_{link} \qquad (7)$$

For simplicity, and without loss of generality, we normalize the network demand
to unit periods ($S_{link} = 1$).

4.8 Peer Demand ($D_{peers}$)

Similarly, for node latency $S_{peer}$ and under the assumption of uniform
message routing, the transit frequency is:

$$f_{peers} = \frac{1}{N} \qquad (8)$$

and the total peer service demand is:

$$D_{peers} = \frac{S_{peer}}{N} \qquad (9)$$

Again, we normalize the peer demand to unit periods ($S_{peer} = 1$) in the
subsequent discussion.



4.9 Bandwidth (X) 

It follows from Little's law, U = XD (see e.g., [Gunther 2000] p. 44), that
when any node in the network reaches saturation (U = 1) the maximum system
throughput is determined by:

$$X_{max} = \frac{1}{\max[D_{peers}, D_{link_1}, D_{link_2}, \ldots]} \qquad (10)$$

The node with the longest service demand $D_{max}$ is the system bottleneck.
The service demand at the bottleneck therefore determines the maximum system
throughput.

With these metrics defined, we are in a position to compare the asymptotic
performance of each of the topologies described in Sections 2 and 3.
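The bottleneck argument can be made concrete with a small Python sketch. It is a reading of Sections 4.5 to 4.9 under explicit assumptions, not a reproduction of the paper's model: unit service times, a peer demand of 1/N, a link demand of H/L, all links treated as equally loaded, and H measured from a central reference peer. Throughput is divided by N so that 1.0 means linear scaling. The function names are hypothetical, and the values produced are illustrative rather than the paper's tabulated results.

```python
def relative_bandwidth(n_peers: int, n_links: int, mean_hops: float) -> float:
    """Bottleneck throughput divided by N, with unit service times.

    Assumed demands: D_peers = 1 / N and D_link = H / L; the larger of
    the two is the bottleneck and caps throughput at 1 / D_max.
    """
    d_peer = 1.0 / n_peers
    d_link = mean_hops / n_links
    x_max = 1.0 / max(d_peer, d_link)
    return x_max / n_peers


def hypercube_metrics(d: int):
    """(N, L, H) for a d-dimensional binary hypercube."""
    n = 2 ** d
    links = d * n // 2      # every peer has d links, each shared by two peers
    mean_hops = d / 2       # hop counts from one peer summed, divided by N
    return n, links, mean_hops


def cayley_metrics(v: int, h: int):
    """(N, L, H) for a v-valent Cayley tree with shells k = 1 .. h-1."""
    shells = [v * (v - 1) ** (k - 1) for k in range(1, h)]   # peers at k hops
    n = 1 + sum(shells)
    links = n - 1           # a tree on N nodes has N - 1 links
    mean_hops = sum(k * size for k, size in enumerate(shells, start=1)) / n
    return n, links, mean_hops


for label, metrics in [("20-cube", hypercube_metrics(20)),
                       ("8-Cayley", cayley_metrics(8, 8)),
                       ("4-Cayley", cayley_metrics(4, 13))]:
    print(label, metrics[0], round(relative_bandwidth(*metrics), 3))
```

Under these assumptions the hypercube evaluates to exactly 1.0, because its link and peer demands coincide, while the trees are limited by their links: a tree has only N - 1 links to carry paths whose average length grows with the height.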



5 Relative Bandwidth 

Since we are interested in network scalability up to a few million peers, it is
sufficient to base the comparison on the asymptotic network throughput defined
in (10). In particular, we will rank the above hypernets according to their
relative maximal bandwidth,

$$X_{relative} = X_{max}(N)/N \qquad (11)$$

where N is the number of peers in the horizon (Table 3 at the end of this
section). $X_{relative} = 1.0$ corresponds to linear scalability since
$X_{max} = N$ in (11).

In several respects our approach is similar to that taken by [Cull et al. 1996]
for their LogP model of assessing parallel hardware performance. In both
approaches, the respective network topology enters into the performance model
via the network demand defined in (6) and (7).



5.1 Cayley Trees 

First, we consider the relative performance of tree topologies. Fig. 1 shows
the normalized bandwidths of a 4-th degree rooted tree, a 4-valent Cayley tree
and an 8-valent Cayley tree.

The 4-valent Cayley tree represents the default peer connectivity in the
original release of Gnutella. Similarly, the 8-valent Cayley tree corresponds to
Ritter's comparison with Napster scalability. The curves in Fig. 1 terminate at
different peer populations because the population takes only discrete values
that depend strongly on the vertex degree and the height of the tree.

We see immediately that the 8-valent Cayley tree has the greatest bandwidth
up through 2 million peers. The 4-valent Cayley tree has the lowest bandwidth;
even lower than the rooted tree. This follows from the fact that at its root the
4-tree has the same connectivity as the 4-Cayley tree but all its descendants
have vertices of degree 5. Even for the 8-Cayley, at 2 million peers the
bandwidth is less than one quarter of linear scalability.

Figure 1: Relative throughput of binary and Cayley trees.

5.2 Trees and Cubes 

We next consider the relative performance of high degree trees and hypercubes.
In particular, Fig. 2 shows the normalized bandwidths for an 8-Cayley (the
best throughput of the trees considered in Fig. 1), a 20-Cayley, and a binary
hypercube. The d-dimensional hypercube clearly exhibits superior scalability.

Figure 2: Relative throughput of Cayley trees and hypercubes.

5.3 Cubes and Tori

Of these high-order topologies, the binary hypercube offers linearly scalable
bandwidth beyond one million active peers (Fig. 3). The 10-dimensional
hypertorus has comparable scalability up to one million peers but degrades
beyond that point. The 3-dimensional hypertorus is also shown for comparison
since that topology has been used in large-scale hardware implementations up to
several hundred nodes per cluster (e.g., the Tandem Himalaya).

Figure 3: Relative throughput of hypercubes and hypertori.

5.4 Ranked Performance 

The main results of our analysis are summarized in Table 3 which shows each
of the topologies ranked by their relative bandwidth as defined in (11).

The 20-dimensional hypercube outranks all other contenders on the basis
of query throughput. For a horizon containing 2 million peers, each servant
must maintain 20 open connections, on average. This is well within the capacity
limits of most TCP/IP implementations [Stevens 1990].

The 10-dimensional hypertorus is comparable to the 20-hypercube in bandwidth
up to a horizon of 1 million peers but falls off by almost 10% at 2 million
peers. The 10-torus is also arguably a more difficult topology to implement.

The 20-valent Cayley tree is included since the number of connections per
peer is the same as that for the 20-cube and the 10-torus. A horizon of 6 hops
was used for comparison because the peer population is only 144,801 nodes at
5 hops. Similarly for 8-Cayley, a 9 hop horizon would contain 7.7 million peers.
These large increments are a direct consequence of the high vertex degree per
node.

The 4-Cayley (modeling early Gnutella) and 8-Cayley (modeling the Napster
population) show relatively poor scalability at 1 million peers. Even doubling
the number of connections per peer produces only slightly better than a 50%
improvement in throughput. This confirms the conclusions reached in
[Ritter 2001] and, moreover, supports our proposal to consider hypernet
topologies.

Table 3: Topologies ranked by maximal relative bandwidth.



6 Conclusions 

Previous studies of Gnutella scalability have tended to overlook the intrinsic
bandwidth limits of the underlying tree topology. The most thorough and
accurate of these studies is that presented in [Ritter 2001]. Unfortunately, his
analysis could be accused of straining at a gnat. As a viable candidate for
massively scalable bandwidth, our analysis demonstrates that trees are dead.

Conversely, by going to higher dimensional virtual networks (and the hypercube
in particular) near-linear scalability can be achieved for populations on the
order of several million peers each with only 20 open connections. According
to Section 4.1, this level of scalability would already match the number of
nodes present in the entire Web.

The dominant constraint for hardware implementations of high-dimensional 
networks is the cost of the physical wires on the interconnect backplane. Since 
the hypernets discussed here would be implemented in software, no such con- 
straints would prevent reaching the desired level of scalability. In this sense, we 
see hypernets as offering good (g)news for Gnutella scalability. 